Motivation

Here is my attempt to tell you the story that I got from the data. Where is the percentage of people leaving their jobs highest? How is overtime connected with attrition? Does gender really matter? And much more, if you sit down and spend some of your time reading my report :)

Prepare data for EDA

Usually, when I start to uncover the story in the data, I spend many hours preparing and cleaning the dataset, but not now! Today I have a nice, prepared dataset, and I only need to do some magic stuff for the sake of better visualization.

Ordinal data

Let’s begin with the ordinal data, just because I want to :). Let’s see how much attrition we have in each category of each parameter.

So what do we see here? We understand that most employees travel rarely, work in the R&D department, have a bachelor’s or master’s degree, etc. But there are some more interesting facts. For example, why do we see only two categories of performance rating? Is it really true that all employees at IBM do their job so well? We also see an unusually high number of people in the OverTime category who quit. But to tell you the truth, it’s a little bit difficult to read the picture because of the imbalanced classes… Let’s invite our old friend percent to this party!
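To show what "inviting percent" means, here is a minimal sketch in plain Python. The rows are made up for illustration and are not the real dataset; the function name `attrition_percent` is my own. The idea is simply to divide the number of leavers in each category by the size of that category, so that small and large categories become comparable.

```python
from collections import Counter

# Toy rows, invented for illustration: (OverTime, Attrition)
rows = [
    ("Yes", "Yes"), ("Yes", "No"), ("Yes", "Yes"), ("Yes", "No"),
    ("No", "No"), ("No", "No"), ("No", "No"), ("No", "Yes"),
    ("No", "No"), ("No", "No"),
]

def attrition_percent(rows):
    """Return {category: percent of rows in that category with Attrition == 'Yes'}."""
    total = Counter(cat for cat, _ in rows)                      # category sizes
    left = Counter(cat for cat, attr in rows if attr == "Yes")   # leavers per category
    return {cat: 100.0 * left[cat] / total[cat] for cat in total}

percent = attrition_percent(rows)
for cat, pct in sorted(percent.items()):
    print(f"OverTime={cat}: {pct:.1f}% left")
```

In raw counts the two groups have a similar number of leavers (2 vs 1), but in percent the difference is obvious, and that is exactly why the normalized picture is easier to read.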

Now we can draw the absolutely logical conclusion that people with frequent business trips, low JobInvolvement, low Environment, Job and Relationship satisfaction, bad WorkLifeBalance and zero StockOptionLevel leave the company more often than the others. One may think this is a rather trivial conclusion, but I have to prove it with data, so I do. What about more specific insights? I have some for you :) Look carefully at the JobRole category. There is something wrong with Sales Representative. Maybe they have a terrible director, or maybe they often have to work overtime. By the way, look at the OverTime section. It is strange that people who work overtime leave the company more often than others. Maybe that means people work hard but don’t get any feedback from the company. Now let’s see the top 5 categories with the biggest attrition.
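A ranking like this can be sketched as follows. This is only an illustration under my own assumptions: the records are invented, the helper `top_attrition` is hypothetical, and the real report may have built its top 5 differently. For every (column, value) pair we count leavers and stayers, compute the attrition rate, and sort by it.

```python
from collections import defaultdict

# Toy records, invented for illustration (column names follow the report)
records = [
    {"JobRole": "Sales Representative", "OverTime": "Yes", "Attrition": "Yes"},
    {"JobRole": "Sales Representative", "OverTime": "Yes", "Attrition": "Yes"},
    {"JobRole": "Sales Representative", "OverTime": "No", "Attrition": "No"},
    {"JobRole": "Manager", "OverTime": "No", "Attrition": "No"},
    {"JobRole": "Manager", "OverTime": "Yes", "Attrition": "No"},
    {"JobRole": "Manager", "OverTime": "No", "Attrition": "No"},
    {"JobRole": "Manager", "OverTime": "Yes", "Attrition": "No"},
]

def top_attrition(records, columns, n=5, target="Attrition"):
    """Return the n (rate, column, value, left, stayed) tuples with the highest rate."""
    counts = defaultdict(lambda: [0, 0])  # (column, value) -> [left, total]
    for rec in records:
        for col in columns:
            key = (col, rec[col])
            counts[key][1] += 1
            if rec[target] == "Yes":
                counts[key][0] += 1
    ranked = sorted(
        ((left / total, col, val, left, total - left)
         for (col, val), (left, total) in counts.items()),
        key=lambda item: item[0],
        reverse=True,
    )
    return ranked[:n]

top = top_attrition(records, ["JobRole", "OverTime"], n=2)
for rate, col, val, left, stayed in top:
    print(f"{col}: {val} -> {rate:.0%} ({left} left vs {stayed} stayed)")
```

Keeping the raw counts next to the rate is useful, because a high rate based on a handful of people may be just noise.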

Here we see that the category with the highest attrition is JobRole: Sales Representative, with 33 people who left the company vs 50 who stayed. A surprising result is that people with a high (24%) PercentSalaryHike leave the company. The small number of data points makes me think this is just a fluctuation and nothing more, but we will keep this fact in mind. Now let’s dive a little deeper into the details. Let’s see how attrition is connected with OverTime and some other parameters.

In each picture we have YearsAtCompany on the x-axis for different parameters and different Attrition values (Attrition = No on the left, Attrition = Yes on the right). In the first row the two pictures are almost logical, except that people who don’t work overtime get a higher salary on average. In the second row we see that people who work harder get a smaller PercentSalaryHike on average than those who don’t work overtime. Maybe this is an effect of averaging, maybe it is bad work by the HR department. In the last row we see another interesting fact: people who work hard and leave the company get promoted less often than people who stay. Of course, this may be an averaging effect too, so IBM’s CEO shouldn’t fire the whole HR department :).
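The numbers behind pictures like these are just group averages. Here is a minimal sketch, assuming each point is the mean of one parameter for a given Attrition, OverTime and YearsAtCompany combination. The rows are invented, `MonthlyIncome` stands in for "salary" as an assumption, and `group_means` is a hypothetical helper. It also returns the group size, since a mean over one or two people is exactly the "effect of averaging" to be careful about.

```python
from collections import defaultdict
from statistics import mean

# Toy rows, invented for illustration:
# (Attrition, OverTime, YearsAtCompany, MonthlyIncome)
rows = [
    ("No", "No", 1, 3000), ("No", "No", 1, 5000), ("No", "Yes", 1, 3500),
    ("No", "No", 2, 6000),
    ("Yes", "Yes", 1, 2500), ("Yes", "Yes", 2, 2700), ("Yes", "Yes", 2, 3300),
]

def group_means(rows):
    """Return {(attrition, overtime, years): (mean value, group size)}."""
    groups = defaultdict(list)
    for attrition, overtime, years, value in rows:
        groups[(attrition, overtime, years)].append(value)
    return {key: (mean(vals), len(vals)) for key, vals in groups.items()}

means = group_means(rows)
for (attrition, overtime, years), (avg, size) in sorted(means.items()):
    print(f"Attrition={attrition} OverTime={overtime} "
          f"YearsAtCompany={years}: mean={avg} (n={size})")
```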